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ABSTRACT 


This  paper  describes  a  framework  for  the  analysis  of  motion  from  sequences 
of  images  by  computer.  The  framework  includes: 

(a)  A  representation  of  knowledge  for  motion  concepts  that  is  based  on  Minsky’s 
frame  theory  and  semantic  networks; 

(b)  Associated  algorithms  for  recognizing  these  motion  concepts.  These  algo¬ 
rithms  implement  a  form  of  feedback,  by  allowing  competition  and  co-operation  among 
local  processes.  They  also  allow  a  change  of  attention  mechanism  that  is  based  on 
similarity  links  between  knowledge  units. 

The  framework  is  being  realized  with  a  system  called  ALVEN,  whose  structure 
and  operation  are  described.  The  purpose  behind  this  system  is  to  provide  an  evolving 
research  prototype  for  experimenting  with  the  analysis  of  certain  classes  of  biomedical 
imagery,  and  for  refining  and  quantifying  the  body  of  relevant  medical  knowledge. 
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1.0  INTRODUCTION 


1.1  Motivation 


This  paper  describes  the  goals,  achievements  and  current  status  of  ALVEN  (A 
Left  Ventricular  Wall  Motion  Analysis  Computer  Consultant),  a  system  for  the  analysis 
of  cinecardioangiograms  (films  of  the  human  left  ventricle).  Given  such  a  film,  the 
system’s  job  is  to  generate  a  conceptual  description  of  the  shapes  and  motions  exhibit¬ 
ed  in  the  film  by  the  left  ventricular  -wall,  noting  abnormalities  and  unusual  occurences. 
The  general  methodology  used  for  the  development  of  ALVEN  follows  current  trends 
within  A1  which  emphasize  the  representation  and  use  of  knowledge  in  the  design  of  ex¬ 
pert  systems.  Earlier  descriptions  of  ALVEN  have  appeared  in  [Tsotsos77a], 
[Tsotsos77b]  and  [Tsotsos78]. 


From  an  AI  point  of  view,  the  research  conducted  in  the  ALVEN  project  is  con¬ 
cerned  with  the  problems  inherent  in  designing  computer-based  Motion  Understanding 
(MU)  systems  for  reasonably  constrained  problem  areas.  In  our  view,  this  research  has 
the  potential  of  contributing  a  new  framework  for  research  on  Computer  Vision  which, 
unlike  earlier  ones,  (e.g.,  [Winston75]),  accounts  fully  for  the  "pragmatics"  of  Visual 
Understanding,  i.e.,  the  context  within  which  each  incoming  picture  can  be  analyzed. 
Clearly,  such  a  framework  will  not  always  be  applicable.  In  some  cases,  pictures  do 
have  to  be  treated  in  isolation  from  other  pictorial  data  and  then  current  analysis  tech¬ 
niques  involving  "top-down",  "bottom-up",  and  "middle-out"  processing  are  relevant.  It 
is  our  claim,  however,  that  in  many  situations  it  is  motion  information  that  can  serve 
most  usefully  as  a  guiding  principle  for  the  analysis  of  pictures  and  for  generation  of  a 
conceptual  description.  The  importance  of  this  research  for  Computer  Vision  rests  on 
this  claim. 


There  are  a  number  of  more  specific  problem  areas  in  designing  expert  based 
systems  for  medical  image  analysis  that  ALVEN  has  forced  us  to  confront.  These  in¬ 
clude  the  design  of  a  vision  system  and  of  a  knowledge  base  (KB)  that  communicate 
through  a  common  terminology  (in  our  case  "verbal"  descriptions),  and  also  the  design 
of  an  explanatory  component  with  which  the  medical  community  can  feel  comfortable. 


From  a  medical  point  of  View,  ALVEN’s  relevance  is  in  the  assessment  of  the 
human  left  ventricle  (hereafter  LV),  as  a  functioning  muscle.  This  function  is  ideally 
characterized  in  terms  of  the  velocity  profile  of  the  LV,  however  obtaining  these 
profiles  is  an  arduous,  difficult  task. 


A  second  problem  in  the  human  analysis  of  cinecardioangiograms,  in  addition 
to  the  huge  number  of  images  that  must  be  analyzed,  stems  from  the  poor  quality  of 
the  individual  images.  This  is  a  direct  consequence  of  x-ray  dosage  limitations  and 
makes  the  analysis  difficult  to  do  in  an  objective  and  consistent  manner.  These  two 
difficulties  were  the  motivation  for  selection  of  this  problem  domain  as  the  application 
area  for  our  research.  The  process  by  which  cinecardioangiograms  are  made,  and  the 
types  of  analyses  which  cardiologists  produce  from  them,  are  described  in  detail  in  a 
subsequent  section. 


1.2  Goals 
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The  main  goal  of  this  research,  from  an  AI  viewpoint  is  the  development  of  a 
framework  for  motion  understanding  by  computer.  This  framework  is  expected  to  pro¬ 
vide: 


(a)  A  representation  of  knowledge  for  motion  concepts, 

(b)  The  description  of  basic  motion  concepts  such  as  verbs,  shape  transforma¬ 
tions  and  temporal  information  in  terms  of  the  knowledge  representation  proposed, 

(c)  The  development  of  algorithms  for  the  analysis  of  pictorial  data  in  terms  of  a 
motion  knowledge  base.  These  algorithms  will  be  expected  to  generate  a  description  of 
the  objects  and  motions  found  in  the  pictorial  data  at  different  levels  of  abstraction, 
from  detailed  LV  wall  kinematics  to  summary  statements  using  English-like  verbs. 


From  a  medical  point  of  view,  the  aim  of  the  research  is  twofold: 

(a)  To  produce  a  system  that  can  aid  in  the  analysis  of  cinecardioangiograms, 

(b)  To  provide  a  framework  for  the  representation  of  medical  knowledge  about 
the  dynamics  of  the  human  left  ventricle. 

ALVEN  is  a  first  step  towards  achieving  these  objectives. 


1.3  The  Problem  Domain 


Left  ventricular  angiography  is  perhaps  the  most  widely  used  cardiac  imaging 
technique.  Briefly,  the  procedure  is  as  follows.  A  catheter,  a  long  thin  plastic  tube, 
whose  end  is  shaped  appropriately  to  facilitate  its  use,  is  inserted  into  either  the  brac- 
chial  (upper  arm)  or  femoral  (upper  thigh)  artery.  It  is  then  pushed,  turned  and  twist¬ 
ed  by  skilled  hands  until  it  reaches  the  desired  location  of  the  heart.  It  can  be  inserted 
into  any  of  the  coronary  arteries  via  the  aorta,  into  any  chamber  of  the  heart,  or  into 
the  pulmonary  system.  For  the  left  ventricle,  it  is  most  commonly  inserted  through 
the  aorta  and  into  the  cavity.  Once  in  its  desired  location,  a  radiopaque  dye  is  injected 
through  it.  This  dye  mixes  with  the  blood,  thus  creating  on  the  X-ray  image,  light  areas 
wherever  the  dyed  blood  is  present.  Figure  1  shows  such  an  actual  image  with  the 
corresponding  left  ventricular  epicardial  outline.  The  central  blob  is  the  left  ventricle. 
The  aorta  is  the  structure  at  the  top  of  the  LV.  On  the  left  upper  side,  the  outline  is 
quite  sharp  -  this  is  the  location  of  the  mitral  valve.  Traversing  counterclockwise 
around  the  LV,  the  outline  becomes  more  blurred.  Actually,  the  fuzziness  is  due  to  the 
fact  that  not  only  the  inner  wall  (endocardium)  is  visible  but  also  the  outer  wall  (epi- 
cardium)  and  they  are  quite  easily  separated  by  eye.  At  the  lower  left,  the  top  portion 
of  the  diaphragm  is  visible.  Figure  2  shows  an  LV  outline  with  various  features  marked. 
This  figure  should  be  referred  to  when  necessary  during  the  remainder  of  this  docu¬ 
ment. 


A  film  is  taken  in  this  manner  with  60  images  per  second  or  roughly  50  images 
per  beat  for  a  normal  heart.  In  addition,  radiopaque  crosshairs  may  be  placed  on  the 
patient’s  chest  during  the  procedure  thus  providing  an  absolute  point  of  reference  for 
the  motion  observed.  Synchronized  ECG  and  pressure  may  be  taken  with  the  film. 
Pressure  requires  two  simultaneous  catheters:  the  one  described  above  and  one  tipped 
with  a  micromanometer  for  pressure  measurement. 
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Figure  1. 
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Figure  2.  The  Left  Ventricle  and  its  Parts 


NOTE:  1) 


The  centroid  is  defined  by  the  point  which  is  631  of  the  distance  along 
the  longitudinal  axis  from  the  aortic  valve  edge,  in  end-systole.  It  is 
the  point  towards  which  all  LV  wall  points  seem  to  move. 


2)  The  segments  are  defined  by  dividing  the  angle  formed  by  the  mitral  valve 
edge,  the  centroid,  and  the  aortic  valve  edge  into  three  equal  portions. 
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This  imaging  technique  has  several  inherent  problems. 

1)  Because  of  the  angle  of  X-rays  and  the  uncertainty  of  the  exact  distance  from 
the  source  to  the  LV,  corrections  for  magnification  and  distortion  of  the  image  must  be 
determined  and  applied  before  further  processing. 

2)  The  image  actually  seen  does  not  necessarily  correspond  to  a  plane  through 
the  left  ventricle  which  is  perpendicular  to  the  axis  of  the  X-ray  beam.  Consider  a  reg¬ 
ularly  shaped  object,  such  as  a  sphere  filled  with  radiopaque  material.  If  this  is  irradi¬ 
ated  by  a  coherent  X-ray  beam,  the  image  would  be  a  circle  as  expected.  If  it  is  irregu¬ 
larly  shaped,  and  the  X-rays  come  from  a  point  source,  then  we  have  a  situation  in 
which  the  outline  is  both  magnified  and  distorted  -  foreshortened  to  be  more  precise. 
Also,  there  is  no  plane  through  the  LV  which  corresponds  to  the  image  and  is  perpen¬ 
dicular  to  the  beam.  Indeed,  the  plane  need  not  even  be  entirely  in  the  object. 

3)  Dye  mixing  is  often  incomplete  and  the  opacified  blood  does  not  fill  the  cavity. 
Thus,  the  light  areas  may  not  correspond  exactly  to  the  cavity  outline.  In  addition,  ar¬ 
tifacts  of  the  outline  are  caused  by  the  papillary  muscles,  the  trabeculae  on  the  endo¬ 
cardium  and  by  the  cusps  of  the  mitral  valve.  If  any  of  these  parts  of  the  LV  either  trap 
blood  or  become  prominent,  they  create  misleading  images.  There  are  also  other  ob¬ 
jects  which  can  obstruct  the  image,  namely  the  ribs,  sternum,  spine,  the  catheter  itself 
and  the  diaphragm.  These  must  be  removed  or  bypassed  in  some  way  for  proper 
analysis. 

4)  In  any  one  view,  anomalies  of  motion  may  be  missed,  if  the  axis  of  the  abnor¬ 
mal  region  is  not  relatively  perpendicular  to  the  axis  of  the  X-ray  beam.  Thus,  one 
must  be  aware  that  one  or  even  two  perpendicular  views  may  not  tell  the  whole  story. 


The  first  problem  has  been  adequately  taken  care  of  by  medical  researchers 
and  is  documented  in  [HeintzenTl].  The  second  can  be  dealt  with  in  the  following 
manner.  The  cardiologist  doing  the  angiogram  may,  in  a  trial  and  error  manner,  find 
the  best  compromise  angles  for  taking  the  film  in  other  than  the  standard  ones. 
Secondly,  if  the  internal  model  of  the  LV  is  three  dimensional  then  a  projection  of  it 
can  be  taken  in  the  same  direction  as  the  X-rays  irradiate  the  patient’s  chest.  This 
projection  should  then  correspond  to  the  actual  image.  A  two  dimensional  model  would 
have  very  little  correspondence  to  the  LV’s  physical  characteristics;  but  nevertheless 
would  still  be  of  great  use  to  clinicians.  We  will  begin  our  work  with  a  2D  model,  and 
provide  ideas  for  how  it  can  be  extended  to  the  3D  case. 


As  for  the  third  problem,  we  will  assume  that  only  images  in  which  the  dye 
mixing  is  acceptable  will  be  analyzed  by  the  system.  However,  artifacts  produced  by 
the  components  of  the  LV  will  be  handled,  and,  obstructing  objects  will  be  recognized 
and  taken  into  consideration.  The  fourth  problem  cannot  be  dealt  with  unless  3D  im¬ 
ages  are  provided.  By  analysing  several  2D  views  it  may  be  possible  to  limit  the  chance 
for  error. 


An  additional  problem  arises  due  to  the  nature  of  left  ventricular  function. 
While  the  LV  contracts  and  expands,  the  forces  within  it  cause  it  to  move  about  or  rock 
within  the  chest  cavity.  This  is  visible  in  some  patients  as  a  distinct  pendulum  type 
motion  of  the  apex  of  the  ventricle.  In  other  patients,  it  is  not  visible  at  all. 


This  rocking  motion  may  be  recognized  and  accounted  for  in  the  following 
manner.  By  observation,  we  notice  that  the  aorta  moves  in  a  roughly  up  and  down 
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manner,  such  that  the  centroid  of  the  sinus  of  valsalva  moves  along  a  corridor.  The 
characteristics  of  this  corridor  have  not  yet  been  determined.  We  can  also  use  the  fact 
that,  because  of  the  shape  characterization,  the  apex  will  always  be  chosen  in  a  con¬ 
sistent  manner.  Now,  if  a  line  joining  the  centroid  of  the  sinus  and  the  apex  changes 
slope  significantly,  we  know  that  the  LV  has  rocked.  In  addition,  the  amount  of  rocking 
motion  to  use  to  correct  the  apparent  motion  of  the  segments  to  true  motion,  can  be 
obtained  by  the  displacement  of  this  line  along  its  length  from  image  to  image.  Obvi¬ 
ously,  this  correction  will  add  uncertainty  to  the  results,  because  the  rocking  is  not 
like  a  rigid  pendulum,  but  is  more  variable.  However,  the  result  should  be  close,  and 
will  definitely  be  much  more  correct  than  if  no  correcting  measures  were  used  for 
rocking  motion.  Fig.  3  illustrates  this. 


What  can  be  derived  from  these  films  and  how  is  the  analysis  done  by  the  car¬ 
diologists?  Well,  the  answer  to  the  second  question  is  easy.  A  cardiologist,  together 
with  his  students,  lock  themselves  up  in  a  hot,  stuffy  little  room  and  watch  the  films 
with  a  videotape  projector.  The  projector  has  forward,  reverse,  variable  speed  and  stop 
frame  features.  Their  several  years  of  experience  enables  them  to  determine  the  types 
of  wall  motion  exhibited  by  each  portion  of  the  wall.  Clearly  this  is  not  always  con¬ 
sistently  and  objectively  done.  The  type  of  learning  which  the  cardiology  resident  ex¬ 
periences  is  "learn  by  example"  or  "abstract  from  a  set  of  example  types".  Thus,  the 
abstracted  model  which  he  has  in  his  mind  is  very  difficult  for  him  to  express.  It  was 
thus  necessary  to  keep  this  in  mind  while  we  were  learning  about  LV  motion  so  that  we 
could  put  our  abstractions  down  on  paper.  This  knowledge  and  model  cannot  be  found 
in  any  cardiology  text. 


Figure  4  shows  the  form  filled  out  by  the  cardiologist  summarizing  his  findings 
on  viewing  the  film.  There  are  also  two  other  basic  measurements  taken  which  are 
commonly  used  in  diagnosis.  LV  volume,  both  end-diastolic  (largest)  and  end-systolic 
(smallest),  and  ejection-fraction,  derived  from  the  volumes.  Only  two  images  are  rou¬ 
tinely  measured  due  to  the  amount  of  work  required  by  the  human  operator.  However, 
manual  tracing  of  certain  films  in  their  entirety  is  done  when  required  and  plots  of 
volume  vs.  time  are  produced.  Volume  is  calculated  based  on  various  models  using  the 
maximum  dimension,  area  and  other  such  measurements. 


Let  us  return  to  the  form  on  Figure  4  and  discuss  some  of  its  features.  One  of 
the  first  points  that  one  notices  is  that  there  are  two  views  of  the  LV  -  the  RAO,  right  an¬ 
terior  oblique,  and  the  LAO,  left  anterior  oblique.  These  terms  simply  describe  the 
direction  to  which  the  X-rays  are  beamed.  Other  views  are  possible  because  the  ap¬ 
paratus  used  can  rotate  fully  360  degrees  around  the  patient;  however,  these  two  are 
used  as  a  compromise  between  having  the  beam  perpendicular  to  the  axis  of  a  normal 
LV  and  avoiding  structures  such  as  the  ribs,  sternum  and  spine  which  would  obstruct 
the  LV.  Secondly,  notice  that  the  LV  outline  is  divided  into  5  regions  for  the  RAO  view 
and  3  for  the  LAO  view.  Both  views  show  the  anterobasal  region,  thus  giving  a  total  of  7 
regions.  The  apical,  diaphragmatic  and  septal  regions  are  related  to  actual  physical 
components  -  the  apex,  the  diaphragm  and  the  septum.  The  others  simply  give  the 
general  location  of  the  region.  For  each  of  these  regions,  the  cardiologist  must  deter¬ 
mine  the  motion  class  which  it  exhibits.  There  are  six  classes  of  motion:  normal;  hy¬ 
pokinetic  -  the  segment  exhibits  less  than  normal  motion;  akinetic  -  the  segment  exhi¬ 
bits  no  motion;  dyskinetic  -  the  segment’s  motion  is  out  of  phase  with  respect  to  dias¬ 
tole  and  systole;  aneurysmal  -  the  segment  absorbs  energy  during  systole  and  releases 
it  during  diastole;  and,  undefined.  Now,  let  us  discuss  the  problems  with  these  classes. 

1)  Any  abnormality  does  not  necessarily  fall  entirely  within  one  of  the  pre-defined 
segments.  Thus,  if  such  an  abnormality  crosses  borders,  it  is  not  true  that  all  portions 
of  both  regions  suffer  from  the  problem. 
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2)  The  regions  are  not  defined  in  a  precise  manner.  They  depend  on  the  viewer’s 
subjective  judgement.  It  is  clear  therefore,  that  a  hypokinetic  posterobasal  segment 
for  example,  in  one  patient,  may  not  mean  the  same  thing  in  another. 

3)  The  heart  may  rock  in  the  chest  cavity  and  this  is  entirely  normal.  The  prob¬ 
lem  that  this  creates  is  that  there  is  superimposed  motion  on  the  regular  contraction- 
relaxation  pattern.  This  causes  problems  for  the  human  viewer,  as  he  must  separate 
the  two  in  order  to  get  an  objective  assessment  of  both  motions. 

4)  The  definitions  of  dyskinetic  and  aneurysmal  give  rise  to  ambiguities  for  they 
both  seem  to  define  the  same  thing. 

5)  Akinetic  motion  can  occur  in  two  ways  -  but  not  in  the  manner  in  which  the 
definition  implies.  It  is  not  likely  that  a  segment  stays  completely  stationary  while  its 
adjacent  segments  move.  Physics  would  tell  us  that  the  moving  segments  would  pull 
the  "dead"  segment  along  with  them.  Thus,  it  does  not  move  -  but  displays  an  abnor¬ 
mal  motion  pattern  -  a  "passive"  motion  pattern.  The  two  ways  in  which  dead  segments 
reveal  themselves  is  either  by  exhibiting  no  or  little  shape  change  or  by  exhibitng  a 
smaller  velocity  of  motion.  In  both  cases,  the  segment  would  lag  behind  the  motion  of 
its  neighbours. 

6)  In  light  of  this  view  of  akinesis,  one  wonders  what  hypokinesis  should  mean? 


From  the  above  comments,  it  is  clear  that  certain  decisions  (perhaps  arbi¬ 
trary)  must  be  made  concerning  the  LV  segments  and  motion  classes  which  ALVEN  will 
consider.  These  were  carefully  and  thoroughly  discussed  with  several  members  of  the 
Cardiovascular  Unit  at  Toronto  General  Hospital.  ALVEN,  at  first,  will  look  only  at  the 
RAO  projection  of  the  LV.  The  LV  wall  will  be  divided  into  3  segments:  posterior,  apical 
and  anterior.  These  segments  are  determined  in  relation  to  the  centroid  of  the  LV. 
The  centroid  is  the  point  which  is  63%  of  the  way  down  from  the  aortic  valve  edge,  along 
the  line  joining  the  apex  and  the  aortic  valve  edge  in  the  end-diastolic  image  (smallest 
volume).  This  technique  is  taken  from  [Alderman79].  The  segments  are  then  derived 
by  dividing  the  angle  formed  by  the  mitral  valve  edge  point,  the  centroid  and  the  aortic 
valve  edge  (shown  in  Figure  2)  into  3  equal  segments. 


ALVEN  considers  only  4  motion  classes:  normal,  akinetic,  hypokinetic  and 
dyskinetic.  Dyskinetic  here  means  motions  which  are  too  fast  (hyperkinetic)  or  mo¬ 
tions  with  improper  trajectories  (paradoxical). 
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2.0  RELATED  WORK 


Research  for  the  ALVEN  system  ties  together  many  distinct  areas  including 
computer  vision,  knowledge  representation  and  (knowledge-based)i  recognition  control 
structures. 


The  contributions  from  past  research  in  computer  vision  can  be  further  sub¬ 
classified.  Low  level  methodologies  such  as  Marr’s  primal  sketch  [Marr76]  and  relaxa¬ 
tion  labelling  as  a  formal  technique  for  the  computation  of  the  raw  sketch  [Zucker77] 
are  particularly  relevant.  At  a  higher  level,  the  concepts  applied  in  the  use  of  a 
knowledge  base  as  an  aid  to  recognition  as  in  SEER  [Freuder76],  [Levine77],  and  VI¬ 
SIONS  [Arbib76]  have  been  considered.  Ballard’s  rib-finder  [Ballard78]  is  most  closely 
related  to  our  work  because  it  also  uses  a  model  to  assist  in  recognition  of  ribs  in  chest 
radiographs.  Relevant  to  our  research  are  also  the  various  past  attempts  to  analyze 
cinecardioangiograms  by  the  medical  research  community,  which  indicated  the 
difficulty  of  the  problem  as  well  as  the  need  for  its  solution  [Heintzen7l]. 


The  concepts  presented  in  Minsky’s  frame  theory  [Minsky75]  and  in  semantic 
network  theory  [Brachman79]  are  relevant  for  our  representation.  Frames  are  used, 
along  with  their  basic  characteristics,  e.g.,  slots,  defaults,  slot  constraints  and  similari¬ 
ty  links,  as  our  basic  representational  unit.  Other  concepts  including  the  IS-A  and 
PART-OF  relationships  are  borrowed  from  semantic  networks  and,  in  particular,  from 
the  PSN  formalism  [Levesque79].  The  representation  of  motion  verbs  and  temporal 
concepts  is  based  on  work  described  in  [Miller72],  [Badler75]  and  [Tsotsos76].  Tem¬ 
poral  concepts  include  absolute  and  relative  time  relations  and  event  sequencing  rela¬ 
tions  for  single  motions  and  collections  of  motions.  Verbs  are  defined  using  location 
changes,  changes  of  physical  properties,  changes  of  spatial  relations  and  temporal  in¬ 
formation. 


Several  different  knowledge -based  recognition  control  concepts  have  been  in¬ 
tegrated  into  our  system,  which  is  modelled  after  the  feedback  concepts  of  control 
theory  [Zucker78b].  Similarity  conditions  are  used  [Minsky75]  to  provide  a  change  of 
attention  mechanism  while  competition  and  co-operation  among  local  processes 
[Zucker78b]  are  used  as  the  focus  of  attention  mechanism. 
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3. 0  REPRESENTA  TION  OF  KNO  WLEDCE 


3. 1  What  Needs  to  be  Represented? 


The  main  concern  of  this  representational  formalism  is  how  motion  concepts 
are  to  be  represented  and  organized  into  a  general  motion  concept  and  problem- 
specific  concept  knowledge  base.  There  are  several  types  of  motion  concepts: 

a)  speed  of  an  object 

b)  trajectory  of  motion 

c)  descriptive  motion  verbs 

d)  qualitative  trajectory  descriptions  -  directionals 

e)  changes  of  an  object’s  properties 

f)  temporal  concepts:  duration,  start  and  end  time,  relative  timing  of  events. 

In  addition  to  providing  a  coherent  scheme  for  representing  such  concepts, 
the  formalism  must  take  several  other  constraints  into  account: 

a)  The  data  involved  often  have  a  "fuzzy"  quality.  There  are  ranges  of  acceptable 
values  to  contend  with,  as  opposed  to  single  correct  values,  as  well  as  overlapping 
ranges  of  acceptable  values  between  conflicting  concepts. 

b)  The  representation  must  have  a  strong  conceptual  basis  with  qualitative  con¬ 
cepts  related  to  quantitative  ones.  This  is  because  the  knowledge  base  consists,  in  part, 
of  qualitative,  verbally-specified  knowledge,  and  also  because  it  is  to  be  used  for 
question-answering  about  its  contents  and  the  analysis  performed. 

c)  Instantiated  data  and  generic  data  must  contain  sufficient  information  for  the 
approximate  re-generation  of  the  analysed  film  or  of  generic  classes  of  motion.  In  oth¬ 
er  words,  the  descriptions  must  be  complete  enough  to  be  converted  back  to  image  se¬ 
quences. 

d)  The  formalism  must  be  capable  of  representing  concepts  at  different  levels  of 
abstraction,  depending  on  the  level  of  expertise  of  the  person  who  enters  the 
knowledge. 


These  constraints  will  not  be  specifically  discussed  in  the  following  sections. 
However,  it  should  become  clear  that  they  are  satisfied  by  the  proposed  framework. 


3.2  Representational  Building  Blocks 


The  representation  of  knowledge  used  here  has  its  roots  in  Minsky’s  Frame 
proposal  [Minsky75]  and  semantic  network  representations  [Brachman79],  particularly 


the  PSN  formalism  [Levesque?9]. 
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3.2.1  Frame  Definition 


The  basic  entities  of  the  representation  are  called  frames  and  are  used  to 
model  abstract  concepts  such  as  that  of  HEART  or  BEATing.  The  instantiations  of  the 
abstract  concepts  of  the  real  world  are  modelled  in  the  representation  by  (frame)  to¬ 
kens.  The  INSTANCE-OF  relationship  relates  tokens  to  the  frames  of  which  they  are  in¬ 
stances.  Thus  the  token  "john’s-LV",  which  represents  a  particular  left  ventricle,  is  re¬ 
lated  to  "LV"  through  the  INSTANCE-OF  relationship.  The  statement 

frame  LV  end 

simply  declares  (defines)  the  frame  "LV",  while 
token  johns-LV  instance-of  LV  end 

defines  the  token  named  "johns-LV"  as  being  an  instance  of  LV.  Every  token  must  be  an 
instance  of  at  least  one  frame. 


Each  frame  may  have  an  associated  description  of  its  internal  structure  (i.e., 
its  components  or  parts).  The  internal  structure  of  LV,  for  example,  may  include  the 
mitral  valve,  the  aorta,  the  sinus  of  valsalva  and  the  LV  wall  (Fig.  2).  The  internal  struc¬ 
ture  of  a  frame  is  defined  by  slots  each  having  a  name  and  an  associated  type  which  re¬ 
lates  it  to  another  frame  that  specifies  the  type  of  tokens  that  can  fill  (be  bound  to)  the 
slot.  For  example,  the  definition  of  "LV"  may  be  extended  to: 

frame  LV  with 

mitral-valve  :  MV  ; 
aorta  :  AORTA  ; 
sinus- of-valsalva  :  SOV  ; 

Iv-wall :  LV-WALL  ; 

end 

in  order  to  indicate  the  components  of  LV  through  four  slots.  Note  that  identifiers  in 
capital  letters  are  used  to  denote  frames  while  those  in  lower  case  letters  are  used  to 
denote  slot  names  and  tokens.  Thus, 

mitral-valve  :  MV  ; 

defines  a  slot  with  name  "mitral -valve"  and  type  "MV”.  "MY"  is  another  frame,  and  is 
defined  separately  in  the  knowledge  base. 


Instances  of  a  frame  must  have  an  internal  structure  which  conforms  to  that 
of  the  frame,  e.g., 

token  john’s-iv  instance-of  LV  udth 
mitral-valve  :  mvl  ; 
aorta  :  aol  ; 

sinus-of-valsalva  :  sovl  ; 
lv-wall :  lv-walll  ; 

end 


where  "mvl",  "aol",  "sovl”,  and  "lv-walll"  are 
"LV-WALL"  respectively. 


instances  of  "MV",  "AORTA",  "SOV",  and 


In  addition  to  a  name  and  a  type,  other  information  can  be  provided  about  a 
slot.  First,  slots  are  classified  into  two  categories  depending  on  whether  they  are  func¬ 
tionally  dependent  on  other  slots.  Prerequisite  slots  are  given  values  on  the  basis  of 
expected  data  (for  the  generic  knowledge  base)  and  tokens  of  the  corresponding 
frames  contain  the  observed  data.  Dependant  slots,  on  the  other  hand,  are  assigned 
values  that  depend  on  the  values  of  the  tokens  which  are  instances  of  the  prerequisite 
slots.  For  example,  the  definition  of  "LV"  can  be  modified  to: 

frame  LV  with 

prerequisites 

mitral-valve  :  MV  ; 
aorta  :  AORTA  ; 
sinus-of-valsalva  :  SOV  ; 
lv-wall :  LV-WALL  ; 
dependents 

area  :  AREA-VALUE  ; 
length  :  LENGTH-VALUE  ; 
width  :  LENGTH-VALUE  ; 

end 


Here,  the  prerequisite  slots  are  to  be  assigned  values  from  the  data  being 
analyzed.  Once  those  values  have  been  computed,  the  prerequistes  for  an  "LV"  have 
been  found.  The  dependent  slots  are  then  computed  from  values  of  the  prerequisite 
slots.  If  any  constraints  on  these  values  are  satisfied,  then  an  instance  of  "LV"  has 
been  recognized.  Fig.  2  shows  the  meaning  of  the  "length"  and  "width"  slots,  while 
"area"  has  a  clear  meaning. 


Slots  may  also  be  assigned  other  information  which  specifies  a  default  value, 
constraints  that  have  to  be  satisfied  by  its  values,  and  exceptions  that  are  to  be  raised 
when  these  constraints  are  not  satisfied.  For  instance,  the  definition  of  the  "LV"  frame 
can  be  further  modified  as  follows: 

frame  LV  with 

prerequisites 

mitral-valve  :  MV 
aorta  :  AORTA  ; 
sinus-of-valsalva  :  SOV  ; 
lv-wall :  LV-WALL ; 
dependents 

area  :  AREA-VALUE  with  area.  =  areaf(self)  such  that 

45  <=  area  exception  TOO-SMALL  with  subj  <-  self 
55  >=  area  exception  TOO-LARGE  with  subj  <-  self  ; 
length  :  LENGTH-VALUE  with  length  =  lengthf(self)  such  that 
10.0  <=  length  exception  T00-SH0RT  with  subj  <-  self 
11.0  >=  length  exception  T00-L0NG  with  subj  <-  self  ; 
width  :  LENGTH-VALUE  with  width  =  widthf(self)  such  that 
4.4  <=  width  exception  T00-NARR0W  with  subj  <-  self 
5.1  >=  width  exception  T00-WIDE  with  subj  <-  self  ; 

end 


The  three  dependent  slots  have  been  associated  with  constraints  and  excep¬ 
tions  that  will  be  raised  if  those  constraints  are  not  satisfied.  The  exceptions  "TOO- 
SMALL",  "TOO-LARGE",  "T00-SH0RT",  "T00-L0NG",  "T00-NARR0W",  and  "TOO-WIDE",  like 
all  exceptions  are  special  frames  that  are  instantiated  when  a  constraint  is  not 


satisfied.  Thus,  if  the  area  of  a  candidate  LV  instance  is  less  than  45  sq.cm.,  then  an  in¬ 
stance  of  the  "TOO-SMALL"  exception  frame  is  created  with  its  subject  slot  filled  with 
the  candidate  LV  instance.  The  notation  "subj  <-  seif  denotes  that  the  "subj"  slot  of 
the  exception  frame  is  to  be  bound  to  the  frame  within  which  the  exception  occurs. 


3.2.2  Primitive  Relationships 


In  addition  to  the  INSTANCE-OF  relationship,  two  other  primitive  relationships 
are  used  to  structure  a  knowledge  base  into  different  levels  of  abstraction.  The  PART- 
OF  relationship  relates  a  frame  to  its  parts,  which  are  the  types  of  its  slots.  Thus,  the 
parts  of  "LV"  are  "MV",  "SOV",  "LV-WALL",  and  "AORTA”.  Parts  may  be  decomposable, 
such  as 

frame  LV-WALL  with 
prerequisites 

anterior  :  ANT  ; 
apical :  APC  ; 

posterior  :  POST  such  that  distance(anterior, posterior)  <=  6 
exception  TOO-WIDE  with  subj  <-  self 

referent  <-  anterior 
distance(anterior, posterior)  >=  5 
exception  TOO-NARROW  with  subj  <-  self 

referent  <-  anterior 

dependents 

arclength  :  LENGTH-VALUE  with  arclength  =  anterior.arclength 
+  posterior. arclength  +  apical. arclength  such  that 
arclength  <=  110  exception  T00-L0NG  with  subj  <-  self 
arclength  >=90  exception  T00-SH0RT  with  subj  <-  self  ; 

end 

which  partitions  the  LV  wall  into  three  segments  as  shown  in  Fig.  2.  The  slot  "ar¬ 
clength"  simply  measures  the  total  arclength  of  the  LV  wall  and  is  of  course,  dependent 
on  the  other  three  slots. 

Generally,  the  PART-OF  relationship  is  a  partial  order  and  defines  a  hierarchy 
of  frames,  the  PART-OF  hierarchy,  where  frames  with  no  internal  structure  are  at  the 
bottom  and  frames  at  any  other  level  have  their  internal  structure  defined  by  frames 
one  level  down.  Clearly,  recognition  of  an  instance  of  a  frame  makes  heavy  use  of  the 
PART-OF  hierarchy  in  the  recognition  of  its  parts. 


The  other  primitive  relationship  used  to  organize  frames  is  called  IS-A.  It  is  in¬ 
tended  to  capture  the  notion  of  generalization  in  a  sense  similar  to  that  in  which  the 
concept  of  person  is  a  generalization  of  the  concepts  of  student,  employee,  and  house¬ 
wife. 


To  illustrate  the  use  of  this  relationship  in  the  organization  of  a  knowledge 
base,  consider  two  specializations  of  "LV"  defined  by: 

frame  NQRMAL-LV  is- a  LV  with 
dependents 

width  :  such  that  length  /  width  >=1.5 

exception  T00-R0UND  with  subj  <-  self 
length  /  width  <=2.5 

exception  TOO-ELONGATED  with  subj  <-  self  ; 


end 


According  to  this  definition,  "NORMAL-LV"  IS-A  "LV"  and  therefore  "NORMAL-LV”  inherits 
all  the  slots  of  "LV”  with  their  types,  defaults,  constraints  and  exceptions.  In  addition, 
because  of  the  deflation  given  above,  the  "width”  slot  has  two  extra  constraints  which 
check  the  shape  of  the  LV  under  consideration.  Note  that  if  the  constraints  of  LV  are 
not  satisfied,  the  object  under  consideration  is  not  an  LV  at  all.  If  they  are  however, 
the  object  is  an  LV,  even  though  it  may  not  be  a  normal  one,  and  the  following  frame 
may  apply: 

frame  ABNORMAL-LV  is- a  LV  with 
dependents 

width  :  such  that  length  /  width  <1.5 

or  length  /  width  >  2.5 

exception  NOT- ABNORMAL  with  subj  <-  self  ; 

end 

In  addition,  the  "width"  slot  of  "ABNORMAL-LV”  has  a  constraint  that  is  not  in  "LV"  which 
is  essentially  the  negation  of  the  new  constraints  of  the  "NORMAL-LV"  frame. 


The  IS-A  relationship  is  a  partial  order  relation  and  defines  another  hierar¬ 
chy,  the  IS-A  hierarchy,  along  which  frames  can  be  organized.  The  hierarchy  will  in 
general  be  several  levels  deep.  Thus,  below  ABNORMAL-LV  we  may  have 

frame  ROUND-LV  is- a  ABNORMAL-LV  with 
dependents 

width  :  such  that  length  /  width  <  1.5  ; 

end 

and 

frame  ELONGATED-LV  is- a  ABNORMAL-LV  with 
dependents 

width  :  such  that  length  /  width  >  2.5  ; 

end 


3,2.3  Similarity  Links 


The  reader  may  have  noticed  already  that  the  representation  scheme  in  the 
ALVEN  project  is  biased  towards  recognition.  For  instance,  the  classification  of  slots 
into  prerequisites  and  dependents  is  useful  to  a  recognizer  which  has  to  know  which 
portions  of  a  frame  definition  must  be  observed  in  the  data  and  which  portions  define 
how  to  instantiate  the  frame  concept. 

Perhaps  the  most  important  representational  feature  for  the  purposes  of 
recognition  is  the  similarity  link.  These  links  relate  frames  that  are,  in  some  sense, 
similar  and  are  used  to  handle  situations  where  one  or  more  matching  exceptions  have 
been  raised. 


Each  similarity  link  carries  two  types  of  information  (in  addition  to  its  source 
frame  and  destination  frame (s)): 

a)  A  condition  which  determines  whether  the  raised  exceptions  can  indeed  be 
handled  by  the  link;  and 
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b)  binding  information  that  indicates  how  the  values  of  the  slots  in  the  source 
frame  are  related  to  those  in  the  destination  frame. 


The  condition  in  turn  is  made  up  of  two  components,  a  similarity  expression  , 
and  a  difference  expression.  The  similarity  expression  contains  the  necessary  condi¬ 
tions  which  the  two  linked  frames  must  share  in  order  for  the  similarity  link  to  be  ac¬ 
tivated  (i.e.,  the  "similarities").  Since  only  one  of  the  frames  (the  source  frame)  is  ac¬ 
tive  when  this  condition  is  evaluated,  the  predicates  must  describe  prerequisite  com¬ 
ponents  of  the  destination  frame  that  are  also  possessed  by  the  source  frame.  The 
difference  expression  is  a  list  of  exceptions  which  may  be  raised  during  matching  of 
the  source  frame.  Each  of  the  list’s  elements  has  an  associated  time  interval  that 
specifies  when  the  exception  must  occur.  In  effect,  such  an  expression  indicates  the 
time  course  of  the  matching  failures  in  the  source  frame  context  which  indicate  that 
the  destination  frame’s  context  would  be  more  relevant,  (i.e.,  the  "time  sequence  of 
differences").  Thus,  the  similarity  links  point  to  alternate  frames  potentially  viable 
when  exceptions  are  raised. 


To  illustrate  the  use  of  similarity  links,  consider  a  modification  of  the 
"NORMAL-LV"  frame  : 

frame  NORMAL-LV  -is-a  LV  with 
prerequisites 

/*  as  before  */ 

dependents 

/*  as  before  */ 
similaiuty  links 

sim-linkl  :  ABNORMAL-LV 
with  similarities  ; 

for  differences  x:  TOO-LONG  where  subj  =  lv-wall 
bound  by  :  AORTA,  SOV,  MV  ; 

end 

The  similarity  link  defined  here  is  named  "sim-linkr',  its  destination  frame  is 
"ABNORMAL-LV",  and  it  is  applicable  when  an  instance  "x"  of  TOO-LONG  has  been  raised 
during  the  course  of  recognition  of  values  for  one  of  the  slots  of  "NORMAL-LV".  Note 
that  this  similarity  link  is  not  used  when  one  of  the  constraints  of  "NORMAL-LV"  fails, 
but  rather  when  a  constraint  of  one  of  its  parts  fails.  For  instance,  if  while  recognizing 
"LV-WALL"  the  TOO-LONG  exception  is  raised,  then  "sim-linkl"  is  applicable  and  sug¬ 
gests  an  alternate  candidate  frame.  Presumably,  this  frame  ("ABNORMAL-LV")  is  simi¬ 
lar  in  that  it  has  some  slots  which  have  identical  types  with  those  of  "NORMAL-LV"  (AOR¬ 
TA,  SOV,  MV).  However,  "ABNORMAL-LV"  must  also  be  different  in  at  least  the  sense  that 
it  doesn’t  have  an  LV-WALL  part  as  specified  by  the  LV-WALL  frame. 


The  features  of  similarity  links  as  used  in  the  representation  scheme  of  AL- 
VEN  have  their  roots  in  the  exception  handling  mechanisms  proposed  in 
[Goodenough75]  and  [Wasserman77],  but  are  closer  to  those  used  in  [Mylopoulos78]. 
Note  that  similarity  links,  as  defined  here,  strongly  depend  on  the  PART-OF  hierarchy, 
since  the  similarity  links  of  a  frame  A  are  used  to  handle  exceptions  which  are  raised 
during  the  recognition  of  the  parts  of  A. 


Similarity  links,  in  addition  to  frame  slots,  are  inherited  along  the  IS-A  hierar¬ 
chy.  This  feature  is  very  useful  because  it  drastically  reduces  the  number  of  links  that 
have  to  be  specified  explicitly  in  the  knowledge  base. 


3.3  A  Motion  Knowledge  Base 
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The  representation  scheme  described  in  the  previous  section  will  now  applied 
to  the  representation  of  motion  concepts.  Fig.  5  gives  a  global  view  of  the  frames  used 
and  their  organization  under  the  IS-A  relationship.  The  left  side  of  the  hierarchy 
describes  motion  concepts  relevant  to  single  object  movements  with  respect  to  a  fixed 
point  of  reference  (e.g.,  object  rotation).  Single  objects  are  assumed  to  be  indivisible 
and  may  represent  an  entity  or  part  of  an  entity.  Simple  motions  are  further  classified 
into  changes  of  location  (LOCATION-CHANGE)  or  changes  of  physical  properties  (PHYS- 
PROP-CHANGE).  Examples  are  shown  for  each  of  these  classes  as  well. 


The  right  hand  side  of  the  hierarchy  shown  in  Fig.  5  deals  with  aggregate  mo¬ 
tions.  Such  motions  are  further  classified  into  motions  of  objects  defined  in  terms  of 
the  motions  of  their  parts  (e.g.,  contract,  beat  ),  and  motions  involving  independent 
moving  participants  (e.g.,  lift,  pull).  The  first  category  includes  motions  with  respect 
to  an  object’s  centroid,  to  its  lateral  or  longitudinal  axes,  or  irregular  motions  with  no 
symmetry  point  or  axis.  The  second  category  includes  the  motion  classes  proposed  by 
[Miller72].  This  classification  of  motion  concepts  is  based  on  work  described  in  [Tsot- 
sos76]. 


To  define  some  of  the  relevant  frames,  consider  first: 


frame  TIME-INTERVAL  with 
'prerequisites 

start-time  :  NUMBER  such  that  start-time  >=  0  ; 
end-time  :  NUMBER  sxLch  that  end-time  >=  start-time  ; 
dependents 

duration  :  NUMBER  with 

duration  <-  end-time  -  start-time; 


end 


Frames  for  "speed"  and  for  "trajectory"  are  also  essential  for  defining  motion 
frames  and  they  are  defined  in  analogous  ways. 


A  typical  example  of  a  motion  frame  is  that  of  "CONTRACT”.  To  define  it,  we 
first  need  to  define  the  frame  "AREA"  which  keeps  track  of  the  area  of  objects  as  a 
function  of  time: 

frame  AREA  with 
prerequisites 

subj  :  OBJECT  ; 
ti  :  TIME-INTERVAL  ; 
start-a  :  AREA-VALUE  ; 
end-a  :  AREA-VALUE  ; 

end 

"CONTRACT"  can  then  be  defined  as  follows  : 

frame  CONTRACT  is-a  AREA-CHANGE  with 
prerequisites 

subj  :  CONTRACTILE-OBJECT  ; 
t  :  TIME-INTERVAL  such  that 

for  al  :  AREA  where  al.subj  =  self.subj 

al.ti.start-time  =  self.t.  start-time 
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Figure  5.  Global  View  of  the  Motion  IS-A  Hierarchy 


end 


al.ti.  end-time  =  self.  t.  end-time 
al.start-a  >  al.end-a  ; 

dependents 

area-change  :  AREA-VALUE  with 

area-change  <-  al.start-a  -  al.end-a  ; 


The  for  construct  allows  the  identification  of  an  instance  of  AREA  with  given 
values  for  its  "subj"  and  ”ti"  slots.  The  token’s  area  slots  are  then  compared  to  ensure 
that  there  is  really  a  contraction.  It  should  be  clear  from  this  example  that  any  func¬ 
tion  (ECG  signals,  LV  pressure  curves,  velocity  profiles)  can  be  represented  in  this 
manner  (piecewise  linear  approximation).  By  using  other  frames,  similar  to  the  CON¬ 
TRACT  frame,  significant  occurrances  in  the  function  can  be  detected  and  labeled  for 
later  use  in  the  description  derived  by  the  system. 


As  another  example,  consider  the  defintion  of  "TOWARDS”: 

frame  TOWARDS  is- a  OBJECTIVE-DIRECTIONAL  with 
prerequisites 

direction  :  ANGLE  such  that 

arctan( direction)  -  arctan((referent.loc.y  -  subj.loc.y)/ 
referent.loc.x  -  subj.loc.y))  /  arctan(direction)  <=  0.1 

end 


It  is  assumed  here  that  every  physical  object  has  a  ’Toe"  slot  whose  value  is  a  location 
with  an  "x"  and  "y"  component.  Moreover,  "TOWARDS"  inherits  from  "OBJECTIVE- 
DIRECTIONAL,  "speed",  "subject"  and  "referent"  slots  which  are  used  to  specify  the  con¬ 
straints  of  "TOWARDS".  According  to  these  constraints,  the  subject  of  "TOWARDS"  must 
be  moving  and  the  direction  of  its  motion  must  be  within  a  10%  cone  around  the  direc¬ 
tion  of  the  vector  defined  by  the  referent  and  subject  objects. 


3.4  An  Overview  of  Left  Ventricular  Motion  Concepts 


As  mentioned  earlier,  the  outline  of  the  left  ventricle  will  be  described  by  par¬ 
titioning  the  outline  into  3  segments:  posterior,  apical  and  anterior  (when  viewed  in 
the  Right  Anterior  Oblique  projection).  These  segments  form  the  parts  of  the  LV’s  ob¬ 
ject  frame.  The  constraints  specified  in  the  frame  are  mostly  spatial  and  are  time- 
invariant. 


LV  motions  (cycles),  both  normal  and  abnormal,  fall  under  aggregate  motions 
in  the  IS-A  hierarchy  (see  Fig.  6  for  the  frames  which  fall  below  aggregate  motions  and 
which  apply  to  LV  cycles).  While  the  form  of  such  motions  is  not  known  precisely,  there 
are  some  subjective  data  for  structuring  an  initial  version  of  the  knowledge  base.  In 
particular,  there  are  7  distinct  phases  to  a  left  ventricular  beat  (assuming  no  arrhyth¬ 
mias)  [Silber75],  for  which  approximate  timing  relationships  are  known.  In  addition, 
some  information  is  available  about  the  expected  velocities  of  LV  segments.  An  impor¬ 
tant  component  of  ALVEN  is  a  data  collection  system  whose  purpose  is  to  determine 
this  information  more  precisely,  thus  allowing  an  iterative  refinement  of  the  knowledge 
base. 


Using  such  information,  a  sketch  of  the  knowledge  base  can  be  constructed. 
There  is  a  separate  motion  frame  for  each  phase.  Some  phases  define  contraction  of 
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the  LV  (maximum  ejection,  reduced  ejection),  others  expansion  (rapid  inficw,  diastasis, 
filling  by  atrial  contraction),  and  two  define  a  state  of  rest  (isovolumic  contraction,  Iso- 
volumic  relaxation  -  isovolumic  is  the  key  here).  Therefore  the  phase  frames  are  IS-A 
linked  to  the  appropriate  frames  for  "contract"  and  "expand".  In  each  frame,  all  of  the 
segments  of  the  LV  are  associated  with  the  event  or  sequence  of  events  that  define  the 
segment’s  motions  during  that  phase.  Correspondingly,  the  frame  for  "normal  LV  beat" 
would  be  made  up  of  the  events  which  represent  each  phase  of  the  LV,  linked  in  a  se¬ 
quence  with  appropriate  timing  information  so  that  together  they  define  the  beat.  The 
frame  for  the  normal  left  ventricular  cycle  is  given  below. 


frame  NORMAL-LV-CYCLE  is- a  CYCLE  with 
prerequisites 

n-i-r  :  N-ISOVOLUMIC-RELAXATION  such  that 
ti. end-time  =  n-dias.ti. start-time  ; 
n-dias  :  N-DIASTOLE  such  that 

ti.start-time  =  n-i-r.  ti.  end-time 
ti.end-time  =  n-i-c.  ti.start-time; 
n-i-c  :  N-ISOVOLUMIC-CONTRACTION  such  that 
ti.start-time  =  n-dias.  ti.end-time 
ti.end-time  =  n-sys.  ti.start-time; 
n-sys  :  N-SYSTOLE  such  that 

ti.start-time  =  n-i-c.  ti.end-time; 

dependents 

subj  :  NORMAL-LV  ; 
ti  :  TIME-INTERVAL  with 


end 


start- time  <-  n-i-r. ti.start-time 
end-time  <-  n-sys. ti.end-time 
duration  <-  default  0.8  ,  such  that 

duration  >0.7  exception  TOO-SHORT-DUR 
with  subj  <-  self 

duration  <0.9  exception  T00-L0NG-DUR 
with  subj  <-  self  ; 


This  frame  clearly  does  not  contain  all  the  information  that  is  necessary  to 
define  a  normal  LV  cycle.  Its  parts  are  further  decomposed  according  to  the  hierarchy 
shown  in  Fig.  7  and  frames  are  defined  for  each  of  those  concepts  as  well. 


An  example  of  a  similarity  link,  for  which  the  data  was  found  in  a  cardiology 
journal  [Sanderson77],  and  used  directly,  is  that  which  relates  the  normal  left  ventricu¬ 
lar  cycle  to  that  of  a  hypertrophic  ventricle  (hypertrophic  cardiomyopathy).  This  link, 
shown  below,  would  appear  in  the  "NORMAL-LV-CYCLE"  frame. 

similarity  links 

sim-link5  :  HYPERTROPHIC-CARDIOMYOPATHY 
with  similarities 

subj. width  =  he. subj. width  at  n-i-c. ti.start-time  ; 
for  differences 

x  :  T00-SH0RT  where  subj  =  self. width 

t  =  n-i-r.  ti 

y  :  T00-L0NG-DUR  where  subj  =  self 

t  =  n-i-r.  ti 

z  :  T00-L0NG-DUR  where  subj  =  self 

t  -  n-r-f.ti ; 

bound  by 

N-ISOVOLUMIC-CONTRACTION 
N-SYSTOLE  ; 
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Figure  7.  Portion  of  the  LV  Motion  PART-OF  Hierarchy 
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The  values  for  determination  of  when  these  exceptions  occur  are  also  given  in 
that  paper  and  would  be  included  in  the  appropriate  slot  descriptions.  The  TOO-SHORT 
exception  is  reused  for  LV  width  (length  of  the  lateral  axis)  if  its  dimension  falls 
between  2.0  and  2.8  cm.  The  normal  values  are  3.3  and  4.1  cm.  at  the  start  of  the  isovo- 
lumic  relaxation  phase.  The  T00-L0NG-DUR  exception  (that  is,  duration  is  too  long  for 
an  event)  is  raised,  in  the  first  case  if  the  isovolumic  relaxation  phase’s  duration  is 
between  100  and  180  msec.  The  normal  values  are  79  to  107  msee.  In  the  second  case, 
it  is  raised  if  the  duration  is  between  96  and  224  msec,  while  the  normal  values  are  110 
to  150  msec.  Note  that  in  the  first  T00-L0NG-DUR  exception  ("y"),  the  values  overlap 
with  the  normal  case.  This  means  that  both  frames  would  be  active  and  subsequent 
data  would  be  used  to  choose  the  better  of  the  two. 


There  are  several  additional  domain  related  considerations  which  must  play  a 
part  in  the  definition  of  the  knowledge  base. 


a)  LV  orientation  and  size  -  Orientation  is  not  important  because  all  descrip¬ 
tions  are  relative  to  the  LV  and  its  shape  feature  axes.  Size,  however,  is  important  be¬ 
cause  different  size  ventricles  exhibit  different  motion  characteristics.  In  addition, 
thick  walled  and  thin  walled  ventricles  exhibit  very  different  motions.  Thus,  these 
different  types  require  different  motion  frames.  The  LV  size  and  wall  thickness  will  be 
input  to  the  system  before  analysis.  This  information  can  be  used  so  that  only  the 
relevant  portions  of  the  KB  are  initially  considered. 


b)  Rocking  Motion  -  In  addition  to  the  contraction-relaxation  motion  patterns 
of  the  left  ventricular  wall,  the  heart  may  also  "rock"  or  "sway”  within  the  chest  cavity. 
A  frame  to  define  rocking  motion  may  be  set  up  using  the  following  characterization. 
The  motion  involves  rotation  of  the  LV's  length  axis  about  some  point  in  the  sinus  of  val¬ 
salva.  The  rotation  must  be  cyclic  and  such  a  motion  is  the  defintion  of  the  verb 
"SWAY'1.  However,  the  fact  that  it  is  superimposed  onto  the  LV  contraction  and  expan¬ 
sion  pattern  must  be  taken  care  of.  This  may  be  done  by  including  rocking  in  the  LV- 
cycle  frame  and  distinguishing  between  the  two  frames  by  naming  them  ''LV-CYCLE" 
and  "LV-CYCLE-WITH-ROCKING".  The  rocking  seems  to  have  the  same  cycle  time  as  the 
beat.  Thus,  if  the  system  has  hypothesized  a  rocking  motion  of  the  LV,  it  can  be  added 
to  the  regular  motion  before  comparing  it  to  the  description  of  the  observed  LV.  Then 
it  may  be  subtracted  out.  In  general,  if  a  simple  motion  is  superimposed  onto  an  ag¬ 
gregate  motion,  the  motion  frame  will  contsdn  the  events  for  the  object’s  parts  and 
events  for  the  superimposed  motions.  This  may  be  a  superimposed  translation  or  rota¬ 
tion,  or  a  change  of  physical  properties  such  as  a  change  of  area.  The  information  in 
this  last  case  would  be  redundant,  because  it  is  the  result  of  the  sum  of  the  motions  of 
the  object’s  parts. 


c)  Abnormality  Regions  -  The  characteristics  of  regions  of  the  LV  wall  which 
exhibit  abnormal  motion  patterns,  regardless  of  the  type  of  abnormality  (akinesis, 
dyskinesis,  hyperkinesis,  etc.),  are  also  stored  in  the  knowledge  base.  Abnormality  re¬ 
gions  do  not  always  fall  wholly  within  one  segment  and  if  they  span  two  or  more  seg¬ 
ments,  their  characteristics  may  be  different.  However,  these  characteristics  may  be 
defined  for  each  abnormality  relative  to  its  position  on  the  LV  wall.  In  addition,  an  ab¬ 
normality  spanning  two  regions  has  different  properties  again.  The  boundary  condi¬ 
tions  between  abnormal  and  normal  tissue  may  differ  depending  on  size  and  location  of 
abnormality.  Therefore,  it  has  been  found  necessary  to  have  separate  frames  for  each 
of  these  abnormality  types  to  ensure  that  the  above  considerations  are  taken  into  ac¬ 
count. 


2  4 


4.0  A  KNOWLEDGE-BASED  FEEDBACK  MODEL  FOR  ANALYSIS  OF 

TIME-VARYING  IMAGERY 


The  hypothesize-and-test  paradigm  is  both  commonly  used  and  very  impor¬ 
tant  to  AI.  However,  the  determination  of  which  hypotheses  to  test  remains  a  difficult 
problem.  This  section  presents  a  framework  for  integration  of  this  paradigm  with  the 
concepts  of  competition  and  co-operation  and  feedback.  The  structure  is  dependent 
on  a  knowledge  base  whose  organizational  concepts  are  as  described  in  the  previous 
section.  This  unified  structure  provides  a  mechanism,  for  analysis  of  time-varying  im¬ 
agery. 


4. 1  Motivation 


In  [Zueker78b],  Zucker  discusses  the  possible  use  of  concepts  from  control 
theory  and  in  particular,  feedback  systems  for  pruning  the  active  hypothesis  set  dur¬ 
ing  recognition  using  the  hypothesize-and-test  paradigm.  However,  the  exact  relations 
between  these  concepts  and  how  they  may  be  incorporated  into  a  unified  knowledge- 
based  recognition  system  are  not  worked  out.  In  order  to  get  a  better  grasp  of  feed¬ 
back  as  defined  in  control  theory,  it  is  useful  to  define  the  components  of  a  feedback 
system  in  more  abstract  terms  than  usual. 


Consider  the  configuration  of  a  standard  feedback  system  shown  in  Fig.  8. 
(from  [Zucker78b]).  Note  that  the  system  really  consists  of  five  mechanisms: 

1)  a  mechanism  to  generate  the  system’s  output  in  the  terms  of  the  problem 
domain  ("process”), 

2)  a  mechanism  to  provide  input  from  the  problem  domain  with  respect  to  the 
output  in  terms  the  system  would  understand  ("feedback  transducer"). 

3)  a  mechanism  to  detect  whether  or  not  there  is  a  difference  between  the  data 
from  the  problem  domain  and  system’s  expected  version  of  the  data  ("summation  of 
reference  and  feedback  signals"), 

4)  a  mechanism  to  determine  the  compensatory  action  if  differences  are  detect¬ 
ed  ("controller"), 

5)  a  mechanism  to  generate  a  reference  signal  which  represents  the  system’s 
desired  state  for  the  data  ("ref"). 


In  such  a  process,  any  deviation  from  the  required  output  by  the  data  from 
the  problem  domain  due  to  noise  or  errors,  causes  a  transient  response  until  a  new 
stable  state  is  reached.  In  particular,  feedback  may  be  "negative"  in  which  case  it 
helps  the  system  dampen  or  stablilize  its  transient  output,  or  it  may  be  "positive"  ,  in 
which  case  it  helps  the  system  change  its  output  away  from  the  current  output.  The 
system  output  is  unstable  in  cases  where  only  positive  feedback  is  present. 
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Figure  8. 
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Figure  9. 


4.2  Overview 


The  above  discussion  is  all  very  nice,  but  how  does  it  help  recognition?  Feed¬ 
back  systems  contain  many  components  which  are  used  in  recognition:  indeed,  the  en¬ 
tire  recognition  control  structure  may  be  described  conceptually  using  just  such  a 
model  with  a  couple  of  modifications. 


The  model  for  recognition  of  time-varying  phenomena  is  shown  in  Fig.  9.  It 
does  not  stand  alone.  The  only  reason  that  such  a  structure  is  possible  and  is  concep¬ 
tually  pleasing  is  that  the  knowledge  base  is  organized  in  an  appropriate  manner.  In¬ 
formation  such  as  similarities  between  hypotheses  and  IS-A  relationships  and  the  fact 
that  frames  can  be  used  to  produce  expectations  must  somehow  be  taken  advantage  of 
during  recognition.  The  structure  shown  in  Fig.  9  does  exactly  that. 


A  model  for  recognition  of  motion  concepts  from  a  sequence  of  images  must 
contain  components  which  perform  the  following  tasks: 

a)  temporal  feedback  -  Data  derived  from  the  images  must  be  related  back  to  the 
system's  current  beliefs  about  what  the  images  contain. 

b)  change  and  focus  of  attention  -  This  may  be  considered  from  two  points  of 
view.  The  system  must  be  able  to  realize  which  objects  are  moving  and  which  are  not 
so  that  time  is  not  wasted  on  computations  involving  motionless  objects.  In  addition, 
the  system  must  be  able  to  "change  its  mind"  about  what  it  believes  to  be  occuring  in 
the  images  and  to  formulate  a  new  "best  idea". 

c)  expectation-guided  determination  of  data  -  From  a  computational  point  of 
view,  the  task  of  completely  analysing  a  long  sequence  of  images  is  horrendous.  More¬ 
over,  it  is  not  necessary  since  a  great  deal  is  known  about  what  types  of  motions  to  ex¬ 
pect  from,  a  particular  class  of  objects  from  image  to  image. 


It  should  be  clear  that  these  tasks  are  not  possible  (or  extremely  difficult)  for 
a  general  class  of  arbitrary  image  sequences.  The  model  presented  in  this  section 
makes  heavy  use  of  a  knowledge  base  of  general  motion  concepts  and  problem-specific 
motion  concepts.  In  other  words,  the  system  "knows"  quite  well  what  kinds  of  motions 
the  objects  in  its  problem  domain  can  exhibit. 


The  components  of  ALVEN’s  version  of  this  model  are  now  described  in  detail. 
The  control  structure  is  shown  in  Fig.  10  and  should  be  referred  to  whenever  nesessary 
during  the  following  discussion.  It  should  become  clear  how  the  control  theory  con¬ 
cepts  relate  to  the  components  of  ALVEN. 


4.3  Image  Analysis  -  Extracting  the  Data 


4.3.1  Computing  the  Essential  Trace 


The  first  step  in  image  analysis  is  the  determination  of  the  LV  outline.  The 
general  way  for  doing  this  is  to  use  an  outline  prediction  in  order  to  find  the  outline  and 
then  to  compute  additional  information  about  the  outline.  This  information  is  termed 
the  "essential  trace"  of  the  LV. 
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The  computation  of  the  essential  trace  is  done  in  two  steps.  The  first  is  the 
determination  of  the  edges  of  the  object  in  the  image,  and  may  be  termed  the  compu¬ 
tation  of  the  raw  essential  trace.  Because  of  the  poor  quality  of  these  images,  such 
edges  are  often  difficult  (or  impossible)  to  find.  Thus  the  system’s  expectations  are 
used  to  limit  the  area  over  which  edges  may  be  found,  and  also  to  bias  the  system  to¬ 
ward  edge  segments  with  certain  orientations.  Objects  which  the  system  considers  as 
"fixed"  (cannot  move  and  assuming  no  observer  motion),  have  no  expectations  associ¬ 
ated  with  them  from  image  to  image.  Those  objects  which  can  move  ("mobile"),  even  if 
no  hypothesized  motions  have  been  associated  with  them,  always  have  expectations  for 
each  image.  When  the  expectations  do  not  agree  with  reality,  the  true  edges  are  found 
by  relaxing  the  constraints  on  the  expectations.  The  second  step  is  an  analysis  of  the 
raw  trace  in  which  descriptors  are  attached  to  the  elements  of  the  trace.  These 
descriptors  are  shape  features,  and  size  and  location  information.  The  final  essential 
trace  thus  consists  of  the  raw  edge  elements  along  with  their  associated  descriptors. 

Clearly  the  first  image  is  a  special  case,  because  no  high  level  expectations 
are  possible.  The  model  assumes  that  a  conceptual  description  is  given  for  the  first  im¬ 
age  and  this  may  be  either  provided  by  the  user  or  computed  by  the  methods  available 
for  single  image  analysis. 

4.3.1 .1  The  Raw  Trace 


The  method  for  determining  the  raw  trace  follows  the  relaxation  labelling  pro¬ 
cess  outlined  in  [Zucker77],  [Zucker78a].  Modifications  to  the  process  are  necessary, 
however,  because  predictions  for  edge  location  are  provided  by  the  higher  levels  of  the 
system.  These  predictions  enable  the  process  to  be  much  more  efficient  than  without 
guidance,  and  also  provide  part  of  the  feedback  necessary  for  the  motion  recognition 
algorithms. 

The  input  to  the  first  phase  of  the  process  is  simply  the  list  of  connected  edge 
segments  that  form  the  outline  of  the  object  of  interest.  Associated  with  each  seg¬ 
ment,  is  a  region  radius.  This  radius  defines  the  extent  of  the  search  space  in  the  im¬ 
age  for  that  segment.  Or,  in  other  words,  it  defines  the  sub-image  over  which  relaxa¬ 
tion  is  to  be  performed.  In  addition,  a  starting  point  on  the  outline  and  a  possibly 
varying  search  length  are  provided.  For  closed  loop  outlines,  this  starting  point  is  not 
crucial  as  long  as  it  is  consistently  determined.  For  left  ventricles,  which  have  open 
loop  outlines,  the  left  upper  point  of  the  aorta  is  used  as  the  starting  point.  The  length 
of  outline  to  search  for  at  any  one  instant  depends  on  the  confidence  of  the  prediction: 
the  smaller  the  confidence  of  the  actual  edge’s  location,  the  shorter  the  search  length. 

Beginning  with  the  starting  point,  the  specified  length  of  predicted  outline  is 
taken  and  a  region  of  specified  radius  is  created  for  it,  Each  point  in  the  region  is  la¬ 
belled  with  the  edge  orientation  of  the  portion  of  the  outline  closest  to  it.  Next,  for 
each  point  in  the  region,  a  5  X  5  gradient  edge  operator  is  applied  in  each  of  the  8  pos¬ 
sible  edge  orientations,  as  well  as  a  "no-edge1'  operator.  The  no-edge  operator  is  basi¬ 
cally  a  detector  of  variance  among  the  responses  for  the  eight  edge  orientations.  Its 
response  is  high  if  there  is  little  difference  among  these  responses  and  low  if  there  is  a 
peak  in  the  response  function.  The  responses  of  the  detectors  are  associated  with  the 
point  as  certainties  and  are  supplied  to  a  prediction  guided  relaxation  labelling  process 
for  determination  of  the  object’s  edge. 


Predictions  enter  the  relaxation  process  in  two  ways:  the  search  region 
(described  in  the  previous  paragraphs)  and  the  expected  edge  orientation.  Relaxation 


is  only  applied  to  the  points  for  which  the  gradient  operator  has  produced  responses, 
i.e.,  in  the  search  region.  The  expected  orientation  is  included  in  the  computation  by 
partitioning  search  regions  into  mosaics  of  sub-regions,  and  then  associating  different 
orientation  biases  with  each  of  these  sub-regions.  These  biases  cause  the  relaxation 
updating  function  to  favour  edge  labels  of  the  expected  orientation  (and  those  immedi¬ 
ately  compatible  with  that  orientation).  Such  information  is  coded  into  a  "strength" 
function.  Suppose  the  expected  edge  orientation  were  "4".  The  corresponding  strength 
function  for  all  the  possible  edge  labels  (excluding  the  no-edge  label)  would  be: 

label  01234  567 

s(4,l)  0.2  0.4  0.8  1.0  1.0  1.0  0.8  0.4 


Thus  edges  that  are  oriented  in  approximately  the  same  direction  as  the  ex¬ 
pectation  are  favoured,  while  orthogonal  ones  are  biased  against.  The  relaxation  up¬ 
dating  function  in  [Zucker77]  is  re-defined  to  include  the  above  orientation  biases  as 
follows: 

p’(i.l)  =  p(i.l)  *  [1  +  q(l)]  *  s(d,l)  /  norm 
norm  =  sum  over  labels  p(i,l)  *  [l+q(l)]  *  s(d,l) 

where  s(d,l)  is  the  strength  function.  The  parameters  of  s  are  the  predicted  direction  d 
and  the  label  1. 


It  is  possible  for  obstructions  such  as  ribs  or  the  diaphragm  to  enter  into  the 
region  under  analysis.  Once  these  obstructions  have  been  found  and  identified,  (from 
the  conceptual  description  of  the  first  image  that  is  provided),  their  edge  orientations 
can  be  computed  and  corresponding  values  for  the  strength  function  reduced  at  those 
points.  Thus,  these  objects  do  not  interfere  with  the  essential  trace  computation. 

When  the  relaxation  process  has  terminated,  the  established  edge  labels  are 
linked,  into  a  segment  of  the  outline,  beginning  with  the  specified  starting  point.  The 
segment  must  then  be  verified  as  a  legitimate  part  of  the  outline.  If  a  continuous  (no 
gaps)  edge  entirely  within  the  region  is  found,  it  is  accepted.  If  the  edge  has  gaps,  and 
the  gaps  can  be  filled  by  inserting  edge  elements  that  a)  satisfy  good  edge  continuation 
criteria,  and  b)  are  completely  within  the  region,  then  the  section  is  corrected  and  ac¬ 
cepted.  Such  "filling  in"  processes  are  useful  for  poor  quality  images,  i.e.,  those  arising 
from  poor  dye  mixing.  In  other  cases,  a  check  is  first  made  to  see  if  any  of  the  ob¬ 
structing  structures  outlined  in  the  first  image  lie  in  the  current  search  region.  If  so, 
their  location  is  passed  down  to  the  relaxation  procedure  so  that  the  strength  function 
may  be  modified  and  the  process  is  repeated.  Otherwise,  the  region  size  is  increased 
on  the  assumption  that  the  prediction  did  not  fortell  the  observed  circumstances.  (The 
hypotheses  did  not  include  the  one  which  expected  the  motion  which  was  actually  ob¬ 
served).  Finally,  if  no  outline  can  be  found,  it  can  be  taken  as  the  predicted  outline. 


When  an  outline  segment  has  been  verified,  the  next  one  is  considered.  It  is 
overlapped  with  the  previous  one  by  at  least  one  point,  which  has  certainty  1.0  for  its 
appropriate  edge  label.  This  anchoring  provides  a  guarantee  of  good  continuation 
between  segments,  but  it  is  attenuated  over  distance.  Thus,  the  segments  should  be 
kept  short. 


An  example  of  the  prediction-guided  relaxation  labelling  process  is  shown  in 
Fig.  11.  The  prediction  window  is  shown  (a),  along  with  the  edge  operator’s  best  edge 
output  (b),  and  3  iterations  of  the  relaxation  process  (c),  (d),  and  (e).  Note  that  the 
two  edge  segments  in  (b)  which  are  orthogonal  to  the  true  edge,  are  almost  completely 


(e) 
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removed  and  replaced  by  true  edge  segments  by  the  third  iteration  (c). 


4.3.1 .2  Descriptors  for  the  Raw  Trace 


The  raw  outline  is  used  to  determine  shape  information.  Shape  features  are 
basically  convex  and  concave  sections.  These  are  then  sub-classified  into  particular 
types  of  concavity  and  convexity  using  descriptive  terms  from  geography.  These  terms 
were  taken  from  [Moore54]  and  modified  to  disambiguate  and  precisely  define  them. 
Examples  of  the  2D  shape  features  are  lakes,  islands,  bays,  capes,  inlets,  coves,  fiords, 
lagoons  of  all  types,  concave  and  convex  points,  peninsulae,  hooks,  etc.  Each  has  asso¬ 
ciated  descriptors:  centroid,  area,  arclength,  longitudinal  axis,  lateral  axis,  and  max¬ 
imum  interior  dimension. 


The  algorithm  for  the  determination  of  bays  and  capes  is  straightforward. 
The  outline  is  traversed  in  a  counter-clockwise  direction  and  significant  left  and  right 
turns  detected.  A  bay  is  defined  as  a  right  turn,  followed  by  a  sequence  of  straight  seg¬ 
ments  and  left  turns,  and  ending  in  another  right  turn.  Similarly,  a  cape  is  a  left  turn, 
followed  by  straight  segments  and  right  turns  and  ending  with  a  left  turn.  Relative 
lengths  of  axes  and  maximum  interior  dimensions  then  are  used  to  further  classify  the 
shape  type.  For  example,  a  fiord  is  a  bay  whose  lateral  axis  is  much  longer  than  its 
longitudinal  axis,  and  this  same  definition  applies  for  a  peninsula  as  a  sub-class  of 
capes.  Smoothing  of  the  outline  is  accomplished  by  running  the  shape  determination 
algorithm  on  the  rough  outline,  and  then  removing  any  shape  features  (replacing  them 
with  straight  line  segments)  whose  arclength  is  smaller  than  some  system  resolution. 


Shape  information  may  be  hierarchically  described  if  the  domain  requires  the 
added  level  of  complexity  for  the  shapes  it  includes.  In  arriving  at  the  shape  hierar¬ 
chies,  small  shape  features  are  smoothed  out  and  the  shape  analysis  is  run  again.  Posi¬ 
tional  information  of  the  shape  features  in  both  gross  and  detail  levels  of  description 
then  determines  which  detail  features  are  encompassed  by  a  gross  feature. 


4.3.2  Computing  the  Essential  Kineses 


Once  the  essential  traces  of  an  object  for  a  pair  of  consecutive  images  have 
been  found,  they  are  compared  and  the  differences  analyzed.  The  resulting  description 
of  their  differences  is  termed  the  "essential  kineses"  for  the  object  for  that  time  inter¬ 
val. 


Computation  of  the  essential  kineses  for  an  object  involves  comparison  of  two 
consecutive  essential  traces  for  the  object.  Motion  is  detected  for  the  object  if  the  lo¬ 
cations  of  any  of  its  parts  differ  between  the  two  traces.  Correlation  of  moving  shape 
features  between  the  traces  is  done  in  a  heuristic  manner.  Using  the  correlation  infor¬ 
mation,  values  are  determined  for  the  shape  transformation  predicates  "STRETCH", 
"DISTORT",  "TRANSLATE",  and  "ROTATE",  as  well  as  for  the  local  shape  group  changes 
"WARP",  "BUCKLE",  and  "STRAIGHTEN".  The  essential  kineses  for  an  object  are  the 
transformations  of  its  physical  properties. 


4.3.2. 1  Correlation  of  Shape  Features 


For  rigid  objects,  the  correspondence  between  components  in  two  successive 
images  can  be  computed  by  minimizing  global  distance  criteria  [Uliman77].  However, 
the  problem  domain  of  LV  wall  motion  involves  non-rigid  object  motion,  thus  making 
the  correspondence  much  more  involved.  Fig.  12  contains  an  example  of  a  set  of  su¬ 
perimposed  LV  outlines  that  indicate  the  types  of  motions  involved.  In  particular, 
problems  arise  because: 

1)  The  number  of  shape  features  along  the  wall  is  not  constant.  There  is  a  small 
number  (  10  -  20  )  in  each  outline  and  it  may  vary  by  a  few  from  image  to  image. 

2)  Shape  features  may  protrude  from  other  larger  features  and  may  also  blend 
into  other  larger  features  between  images. 

3)  Continuity  criteria  must  be  satisfied  at  the  boundaries  between  segments. 

4)  Shape  features  do  not  move  rigidly  -  they  stretch,  distort,  translate,  and  rotate 

(in  2D). 


In  order  to  cope  with  these  different  problems,  several  different  kinds  of  in¬ 
formation  must  be  used.  Furthermore,  since  the  problems  exist  at  different  represen¬ 
tational  levels,  we  have  been  unable  to  formulate  a  single  criterion  that  encompasses 
all  of  them.  Rather,  we  have  found  that  the  following  set  of  matching  heuristics  works 
well  in  practice.  These  heuristics,  together  with  their  underlying  rationale,  are  given 
below. 


1)  2D  distance  is  clearly  an  important  criterion  for  non-rigid  object  motion  as  well 
as  for  rigid  object  motion.  However,  relative  positions  of  start  and  end  points  of  the 
shape  features  along  the  outline  has  turned  out,  in  practice,  to  be  much  more  power¬ 
ful.  Both  are  used. 

2)  Between  images,  no  point  on  the  wall  moves  very  far.  The  data  collection  sys¬ 
tem  we  have  implemented  [Delgrande79]  is  being  used  to  determine  values  for  the 
average  amounts  of  motion  for  the  different  wall  segments  in  normal  and  abnormal  si¬ 
tuations. 

3)  Similarity  of  shape  type  is  a  different  type  of  heuristic.  Shape  types  seem  to 
remain  constant,  with  "continuous"  changes  between  types.  For  example,  a  deep  bay 
cannot  become  a  deep  peninsula  between  images.  It  must  first  become  a  small  bay,  a 
straight  section,  a  small  cape,  and  then  a  peninsula.  Also,  shapes  which  protrude  or 
are  absorbed  from  one  image  to  the  next  usually  have  correspondence  mappings  in 
which  one  shape  becomes  three  (it  "buckles")  or  vice  versa.  Explicit  knowledge  of  such 
transitions  greatly  facilitates  matchings  between  new  shape  features  and  current  ones. 
Other  more  complex  mappings  seem  to  be  much  less  frequent,  which  makes  the 
embedding  of  these  transitions  into  matching  heuristics  very  useful . 

4)  Complex  shape  feature  matchings  as  described  in  3)  are  dependent  on  shape 
arclength  and  relative  positions  of  the  start  and  end  points  of  the  shape  along  the  out¬ 
line. 


5)  Since  shape  types  overlap,  (a  bay  and  an  adjacent  cape  have  common  seg¬ 
ments),  matching  of  one  of  these  features  leads  to  partial  matches  for  the  adjacent 
overlapping  ones. 

6)  If  more  than  one  level  of  shape  description  is  present,  the  matching  can 
proceed  in  a  top-down  manner,  from  the  more  gross  features  to  the  more  detailed 
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NOTE:  Three  examples  of  LV  wall  motion 

patterns  are  shown.  They  are  all  taken  in  the  RAO 
(Right  Anterior  Oblique)  projection.  Motion  of  individual  points 
on  the  LV  wall  can  also  be  seen.  These  motions  are  derived  from 
implanted  myocardial  screws. 

Both  (a)  and  (b)  are  patterns  from  patients  whose  LV’s  are  functioning 
normally,  (b)  shows  the  "classic"  normal  contraction  pattern,  while 
(a)  shows  a  different,  but  functionally  normal  pattern,  (c)  shows 
the  pattern  of  a  LV  with  mitral  stenosis. 


Figure  12  Examples  of  LV  Kail  Motion 


features. 
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There  is  no  need  to  consider  the  possible  start  points  for  the  matching,  be¬ 
cause  all  traces  are  structured  in  the  same  manner  (they  are  all  derived  from  the 
predictions).  An  example  of  the  result  of  these  correspondencies  is  shown  in  Fig  13. 
Two  consecutive  outlines  are  shown,  along  with  the  shape  features  for  each,  and  the 
correlation  between  them  determined  by  the  heuristics  described  above. 


The  raw  kineses  of  an  object  are  a  summary  of  its  movements  over  time. 
They  are  obtained  by  matching  the  feature  descriptors  of  the  object  using  the  six 
classes  of  heuristics  that  were  listed  above. 


4.3. 2. 2  The  Transformation  Primitives 


The  second  step  in  the  process  is  the  abstraction  of  primitive  motion  descrip¬ 
tions  from  these  correspondences.  Two  primitives  are  provided  from  which  all  higher 
level  descriptions  are  obtained:  "LOCATION-CHANGE"  -  change  of  location,  and  "PHYS- 
PROP-CHANGE"  -  change  of  physical  property. 


The  semantic  components  of  LOCATION-CHANGE  are  a  subject  which  exhibits 
the  motion,  a  start  and  an  end  time,  duration  ,  speed,  and  trajectory.  Trajectory  has 
components  in  the  same  directions  as  the  subject’s  longitudinal  and  lateral  axes,  rath¬ 
er  than  in  the  horizontal  and  vertical  directions.  The  intent  here  is  to  provide  a  set  of 
primitive  motion  descriptors  which  can  be  used  by  higher  levels  of  the  system  to  pro¬ 
duce  more  specific  motion  descriptions.  It  is  very  difficult  to  relate  x-y  stretch,  etc.,  to 
higher  level  natural  language  verbs  without  constantly  changing  the  co-ordinate  sys¬ 
tem.  However,  longitudinal  and  lateral  directions  relate  directly  to  higher  level  verbs. 


LOCATION-CHANGE  leads  directly  to  the  two  predicates  at  the  next  higher  lev¬ 
el  of  description:  "TRANSLATE"  and  "ROTATE”.  TRANSLATE  is  used  if  the  subject's 
cetroid  changes  location,  and  ROTATE  if  the  subject's  axes  change  orientation  Rota¬ 
tions  about  exterior  points  are  handled  by  the  verb  "GO-AROUND". 


The  semantic  components  of  PHYS-PRQP-CHANGE  are  a  subject  whose  proper¬ 
ty  is  changing,  the  property  that  changes,  temporal  relations,  (i.e.,  start,  end  times 
and  duration)  and  a  rate  of  change.  Depending  on  the  property,  one  of  several  more 
specific  PHYS-PRQP-CHANGE' s  may  be  used.  For  example,  "AREA-CHANGE"  is  used  for 
changes  in  the  area  property  of  an  object,  and  "STRETCH"  is  used  for  changes  in  an 
object’s  axis  length.  "SHAPE-CHANGE"  is  used  for  major  shape  changes.  It  is  also 
classified  further  into:  "DISTORT"  -  changes  in  relative  position  of  an  object’s  axes, 
"BUCKLE"  -  for  1-3  shape  feature  mappings,  "STRAIGHTEN"  -  for  n-1  shape  feature  map¬ 
pings,  and  "WARP"  -  for  all  other  shape  feature  mappings. 

For  example,  the  frame  tokens  which  would  define  the  changes  in  the  LV  as  a 
whole  between  two  consecutive  images  are: 

token  lv-motion-1  instance-of  LV-PHYS-PRQP-CHANGE  with 
centroid-change  :  translate-1  ; 
long-axis-change  :  stretch-1  ; 
iat-axis-change  :  stretch  -2  ; 
area-change  :  area-change  -1  ; 


end 
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Figure  13  An  Example  of  Shape  Feature  Matching 


outline  1  outline  2 

NOTE:  The  above  shows  two  consecutive  LV  outlines,  with  point  correspondencies  labelled 

as  determined  by  shape  feature  matching.  Notice  that  points  12',  13',  14',  and  15' 
show  an  example  of  the  "buckle"  transformation  primitive,  where  the  "cape"  formed  by 
them  protrudes  from  the  "lagoon"  formed  by  points  1  through  4  in  outline  1. 


I 
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token  translate-1  instance-of  TRANSLATE  with 
subject :  lvl  ; 
time-info  :  til  ; 
start-Loc  :  xy-coords-1  ; 
end-loc  :  xy-coords-2  ; 

end 


The  remainder  of  the  tokens  "stretch-1",  "stretch-2",  etc.  are  defined  in  analogous 
ways.  Frame  tokens  such  as  these  are  provided  not  only  for  the  LV  as  a  whole,  but  also 
for  each  of  its  parts  (LV-wall,  its  segments,  mitral  valve,  etc.),  in  accordance  with  the 
LV  PART-QF  object  hierarchy. 

The  set  of  primitive  motions  associated  with  an  object  between  2  consecutive 
images  at  all  levels  of  its  PART-OF  hierarchy  defines  the  essential  kineses  for  that  ob¬ 
ject. 


4.4  Matching  -  Comparing  the  Data  to  Hypotheses 


The  previous  section  described  how  the  lowest  level  descriptions  of  the  ob¬ 
jects  and  their  motions  are  computed.  These  descriptions  may  or  may  not  be  directly 
comparable  to  those  which  are  present  in  the  currently  hypothesized  frames.  There¬ 
fore  there  are  two  distinct  steps  which  need  to  be  performed. 


The  first  step  determines  whether  or  not  the  description  level  in  the  hy¬ 
pothesis  to  be  matched  against  is  the  same  as  that  of  the  description  derived  from  the 
observations.  This  can  be  done  simply  by  noting  the  level  of  descriptive  terms  used  in 
the  hypothesis  (their  level  in  the  IS-A  hierarchy).  If  they  are  not  at  the  lowest  level, 
then  a  bottom-up  re-computation  of  descriptions  must  be  done.  In  fact,  the  IS-A 
hierarchy  Is  searched  downwards  from  the  concepts  used  in  the  lowest  level  of  descrip¬ 
tion,  matching  against  their  "son"  frames  in  order  to  determine  which  of  them  is  appli¬ 
cable.  This  proceeds  until  the  proper  level  is  reached.  For  example,  if,  at  the  lowest 
level,  the  descriptive  term  is  "AREA-CHANGE",  and  the  hypothesis  uses  "CONTRACT", 
then  the  frames  below  "AREA-CHANGE"  are  checked  in  order  to  see  if  "CONTRACT"  lies 
below  it.  If  it  does  (and  it  does),  the  specified  constraints  are  checked  and,  if  satisfied, 
the  "CONTRACT"  term  is  added  to  description.  Failure  at  this  point  indicates  failure  of 
the  match  and  the  proper  exception  is  raised. 


Once  the  description  levels  match,  the  concepts  and  constraints  are  checked 
one  by  one  -  each  failure  raising  the  specifed  exception.  This  produces  a  (possibly  null) 
set  of  exception  records  for  the  hypothesis  which  was  being  matched. 


4.5  Change  of  Attention  Mechanism 


The  system  maintains  a  set  of  active  hypotheses  throughout  the  recognition 
process.  This  set  is  initially  filled  with  the  motion  hypothesis  the  user  believes  is  exhi¬ 
bited  by  a  particular  object  (or  objects). 


When  a  motion  hypothesis  is  activated,  several  others  are  also  automatically- 
activated: 

a)  any  hypotheses  -which  are  its  IS-A  ancestors, 

b)  any  hypotheses  for  the  current  time  interval  which  form  component  motions  of  the 
hypothesis. 

This  definition  is  applied  recursively. 

Active  hypotheses  are  organized  by  their  "conceptual  adjacency".  By 
definition,  two  hypotheses  H  and  H’  are  adjacent  conceptually,  if  one  (or  more)  of  the 
following  hold: 

A)  There  is  an  active  similarity  link  between  H  and  H\  and  H  and  H’  define  motion 
concepts  with  a  common  time  interval  for  the  same  motion  subjects.  They  are  adja¬ 
cent  only  for  that  common  time  interval.  This  latter  condition  will  be  referred  to  as 
the  "common  subject-time  condition”. 

B)  H  IS-A  H”  and  H’  IS-A  H”  (i.e.,  they  have  a  common  direct  IS-A  ancestor).  This 
means  that  the  hypotheses  are  mutually  exclusive.  The  common  subject-time  condi¬ 
tion  must  hold. 

C)  H  IS-A  H‘  and  the  common  subject-time  condition  holds. 

D)  H  preceeds  H’  in  time  or  H  follows  H’  in  time,  and  they  both  define  motions  for 
the  same  motion  subjects. 

E)  H  HAS-AS-PART  H’  and  they  define  motions  for  the  same  time  interval. 

F)  H  is  adjacent  to  itself  for  the  next  time  frame. 


The  adjacency  types  will  be  referred  to  by  the  letters  which  preceeded  them 
above.  This  organization  provides  the  basis  for  the  focus  of  attention  mechanism  to  be 
described  in  the  next  section. 


The  system  adds  hypotheses  to  its  list  of  active  ones  via  activated  similarity 
links  and  via  the  "next"  temporal  constraint,  (i.e.,  if  motion  B  is  the  "next"  one  after  A 
motion  A  has  an  end  time  which  is  the  same  as  motion  B’s  start  time). 


Similarity  links  are  activated  when  one  of  the  exceptions  specified  in  the 
exception-expression  is  satisfied  and  its  hypothesis  compatibility  expression  is  also 
satisfied.  There  are  several  additional  considerations  that  arise  when  a  hypothesis  is 
added  by  a  similarity  link: 

a)  When  a  particular  exception  cannot  be  handled  by  the  local  similarity  links, 
the  links  of  its  IS-A  ancestors  are  checked. 

b)  Since  activation  of  a  frame  automatically  activates  its  IS-A  ancestors,  transfer 
of  parts  between  the  two  frames’  IS-A  ancestors  may  be  accomplished  by  the  similarity 
links  (if  present)  between  those  frames. 

c)  There  is  a  need  to  also  propagate  similarities  upwards  along  the  PART-OF 
hierarchy,  because  possible  mis-matches  of  a  motion  component  may  require  com¬ 
pletely  new  contexts  for  the  newly  activated  parts.  This  is  done  by  automatically  rais- 
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mg  an  exception  in  the  parent  hypothesis.  The  similarity  link  for  this  exception  must 
specify  in  its  compatibility  expression  that  the  PART-OF  relation  must  hold  between  the 
new  hypothesis  activated  and  any  new  hypotheses  to  be  activated  as  its  possible 
parents.  This  information  is  transferred  by  the  exception  record. 


The  "next"  temporal  constraint  is  used  in  the  following  way.  When  the  current 
motion  hypothesis  reaches  its  specified  minimum  end  time,  its  "next"  motion  hy¬ 
pothesis  for  the  same  subject  is  activated.  The  focus  of  attention  mechanism  then  aids 
in  the  segmentation  between  the  two  hypotheses  in  order  to  find  their  time  boundary. 


Hypotheses  are  deleted  from  the  active  list  in  two  ways.  If  the  hypothesis’  ex¬ 
pected  maximum  duration  is  exceeded,  it  is  removed.  If  the  values  of  its  current  slots 
can  be  found,  it  is  instantiated  (added  to  the  description)  and  its  certainty  factor  com¬ 
puted  (see  next  section).  If  the  slots  cannot  be  filled  it  is  abandoned  completely.  The 
second  way  of  removing  hypotheses  is  by  simple  thresholding.  Once  a  hypothesis’  cer¬ 
tainty  falls  below  some  fraction  (say,  one-third)  of  the  leading  hypothesis’  certainty,  it 
is  deleted.  The  hypothesis  must  of  course  satisfy  the  common  subject-time  condition. 
Deletion  of  a  hypothesis  is  propagated  to  all  of  its  parts,  all  parent  subjects,  and  all  IS-A 
sons  (recursively). 


4.6  Focus  of  Attention  Mechanism 


The  set  of  active  hypotheses  requires  further  organization.  The  prediction 
mechanism  needs  to  know  which  are  the  best  hypotheses  at  any  instant  in  the  process¬ 
ing,  so  that  it  can  base  its  expectations  on  them.  The  hypotheses  will  be  ordered  by 
means  of  the  certainty  which  the  system  has  in  them.  On  activation,  all  hypotheses  are 
given  a  50  -  50  chance  -  a  certainty  of  0.5.  If  any  a  priori  information  is  known  about 
most  likely  patient  diagnosis,  the  information  is  used  at  this  point.  For  each  subse¬ 
quent  image  ,  this  certainty  is  updated  using  the  paradigm  of  competition  and  co¬ 
operation  among  conceptually  adjacent  hypotheses. 


The  updating  rule  is  the  following: 
cert(k+l,H)  =  cert(k,H)  *  [1  +  q(H)]  /  norm(k) 


where:  q(H)  is  the  contribution  from  neighbours  and 
is  given  by  : 

q(H)  =  sum  over  H’  of  £w(H,H',k)  *  comp(H,H',k)  *  cert(k,H')} 

norm(H)  is  the  normalizing  factor  and  is  given 
by: 

norm(k)  =  sum  over  H’  of  £  cert(k.H’)  *  [l+q(H’)]  ] 

w(H,H’,k)  is  weight  of  contribution  for  hypothesis 
H  from  H’  at  time  instant  k.  Sum  of  all  w(H,H',k) 
is  one. 

comp(H,H’,k)  is  compatibility  of  hypothesis  H  with  H’ 
at  time  instant  k. 
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The  compatibilty  between  two  conceptually  adjacent  hypotheses  depends 
directlty  on  the  type  of  adjacency  they  exhibit.  These  adjacency  types  are  now  listed 
again  and  the  compatibilities  for  each  discussed.  Note  that  type  B  takes  priority  over  A 
for  the  assignement  of  compatibilities  if  both  are  present  for  a  particular  hypothesis. 
No  other  combinations  are  possible. 


A)  (H  is  similar  to  H’)  There  are  two  possibilities  here,  depending  on  whether 
exceptions  arise  during  matching.  If  no  exceptions  arise,  then  comp(H,H\k)  =  -0.5  for 
all  k.  If  at  any  k  an  exception  occurs  in  H’  which  satisfies  an  element  of  the  exception 
expression  of  the  similarity  link  joining  H  and  H\  comp(H,H’,k)  =  +0.2.  In  other  words, 
if  there  are  very  few  exceptions,  the  two  hypotheses  are  relatively  incompatible  be¬ 
cause  they  are  alternatives.  However,  if  exceptions  occur  in  H\  H  is  more  likely  to  be 
true  and  should  be  supported  by  H\  In  addition,  for  that  case  comp(H’,H,k)  =  -0.5,  so 
that  H  detracts  support  from  H’. 


B)  (H  and  H’  have  a  common  direct  IS-A  ancestor)  These  two  hypotheses  are 
mutually  exclusive  and  therefore  are  incompatible  (comp(H,H’,k)  =  -1.0). 


C)  (H  IS-A  H’)  These  hypotheses  are  completely  compatible.  If  H  is  true  then 
H’  is  also  true  (comp(H,H’,k)  =  1.0).  However,  the  inverse  is  not  necessarily  true  so 
comp(H’,H,k)  =  0.5. 


D)  (H  follows  H’  or  H  preceeds  H*  in  time)  The  compatibility  is  set  in  this  case 
so  that  the  time  intervals  expected  by  each  hypothesis  may  be  segmented. 
(comp(H,H’,k)  =  -0.8). 


E)  (H  HAS  AS  PART  H’)  Hypothesis  H  is  made  up  of  parts  H‘  and  each  part  con¬ 
tributes  to  the  confidence  of  H.  (comp(H,H’,k)  =  0.7). 


F)  (H  with  itself  in  time  instant  k+l)  This  compatibility  provides  matching 
feedback  for  H  itself.  If  matching  exceptions  occur  in  H,  comp(H,H,k)  =  -0.5.  Other¬ 
wise  comp(H,H,k)  =  0.0. 


The  weighting  factor  assigned  to  each  contributing  hypothesis  depends  on  the 
number  of  active  adjacent  hypotheses  and  their  types.  A  total  weight  of  one-third  is 
shared  among  hypotheses  of  types  D,  E,  and  F  (local  evaluation).  One-third  is  split 
among  the  hypotheses  of  types  B  and  C  (IS-A  constraint),  while  the  final  third  is  divid¬ 
ed  between  type  A  hypotheses  (similarities).  The  sum  of  the  weights  must  be  one. 


4. 7  Expectation  Generation 


Using  the  current  "best"  motion  hypotheses  for  each  object  in  the  image,  an 
expected  contour  for  it  in  the  next  image  may  be  produced  based  on  the  contour  in  the 
previous  image.  To  start  this  process,  it  is  assumed  that  in  the  first  image  of  the  se¬ 
quence,  a  conceptual  description  is  given  for  each  object  in  the  image.  New  objects  are 
not  allowed  to  enter  into  the  image.  This  conceptual  description  consists  of  the  com- 


plete  contour  of  each  object  with  all  of  its  parts  labelled,  so  that  the  instantiation  of  its 
corresponding  object  frame  is  a  trivial  matter. 


The  predicted  outline  may  be  determined  in  one  of  two  ways.  The  prediction 
may  simply  be  the  contour  in  the  previous  image  along  with  a  search  window  of  width 
equal  to  twice  the  maximum  allowed  displacment  for  any  point  in  the  problem  domain. 
However,  in  many  cases,  more  information  is  known  about  the  motions. 


If  the  hypothesis  contains  the  expected  motion  parameters,  they  may  be  ap¬ 
plied  to  the  current  contour  to  derive  the  new  one,  and  the  range  of  these  parameters 
will  give  the  search  window.  In  addition,  if  several  hypotheses  are  considered  "good", 
then  the  union  of  their  predicted  search  regions  is  used  as  the  region  while  the  best 
contour  is  used  as  guidance  for  edge  orientation. 


Clearly,  predictions  are  only  provided  for  objects  whose  motion  capability  is 
at  least  "mobile’'.  "Fixed"  objects  are  assumed  forever  stationary.  If  a  mobile  object 
does  not  have  an  active  motion  hypothesis  associated  with  it,  (i.e.,  it  has  not  yet 
moved),  the  expectation  for  it  is  derived  by  the  simple  method  described  above. 


5.0  CONCLUSIONS 


This  paper  presented  a  framework  for  the  analysis  of  motion  which  includes: 

a)  A  frame-based  representation  scheme  for  motion  concepts, 

b)  A  control  structure  for  the  recognition  of  motion  concepts  which  uses  the 
representational  constructs  in  terms  of  which  the  system’s  knowledge  base  has  been 
defined,  particularly  similarity  links  as  well  as  the  IS-A  and  PART-OF  relationships. 


The  framework  has  been  applied  to  the  analysis  of  cine  car  dioangiograms  and 
a  discussion  of  how  it  would  function  when  presented  with  actual  data  was  also  includ¬ 
ed.  The  framework  is  based  on  current  research  related  to  knowledge  representation 
and  control  structures  within  AI.  We  consider  as  contributions  the  following  features  of 
the  framework: 

a)  From  a  knowledge  representation  point  of  view,  the  presence  of  similarity 
links,  their  interaction  with  the  IS-A  and  PART-OF  relationships,  the  representation  of 
motion  concepts,  and  their  use  by  the  recognition  control  structure. 

b)  From  a  control  structures  point  of  view,  the  use  of  the  feedback  control 
mechanism  proposed  by  [Zucker78b]  in  conjunction  with  a  knowledge  base  which  gen¬ 
erates  expectations  and  limits  the  number  of  possibilities  that  need  to  be  searched  for. 


Currently,  we  are  in  the  final  stages  of  a  prototype  implementation  of  ALVEN. 
The  low  level  modules  (essential  trace  and  kineses)  have  been  implemented  and  are  be¬ 
ing  tested.  The  data  collection  system  for  determination  of  LV  wall  velocity  profiles  has 
been  implemented  [Delgrande79]  and  films  are  being  entered  into  a  data  base.  A  syn¬ 
tax  has  been  defined  for  the  representation  formalism  and  a  compiler  has  been  imple¬ 
mented  [Norman78]  for  entering  and  maintaining  the  knowledge  base. 


The  above  conceptual  structure  is  intuitively  correct  and  seems  to  function 
for  stylized  problems.  In  particular,  the  mechanism  seems  to  be  sufficient  for  the 
analysis  of  cinecar dioangiograms,  although  further  testing  is  necessary.  Once  a  com¬ 
plete  system  has  been  implemented  and  experimented  with,  we  will  know  what 
refinements  are  necessary  to  the  proposed  structures  so  that  the  framework  may  also 
be  applicable  to  other  domains. 
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